Fitting Individual Countries Time Series Trajectories

Lets do some stuff from Solomon Kurz’s other bookdown, Applied Longitudinal Data Analysis in brms and the tidyverse, to see if we can visualize our target variable a bit better. I’m also using some techniques from Hadley Wickham’s nice youtube video, Managing many models with R.

#read the file, if you dont have it from the previous markdown. 
aei <- read.csv("/Volumes/RachelExternal/Thesis/Data_upload_for_CL/AEI_Std.csv")
aei <- aei[,-1]

With Priors

by_ISO <-
  aei %>%
  filter(!is.na(irrperc)) %>% 
  group_by(ISO) %>%
  nest()

Prior Prediction

Doing a little prior plotting, I’ve messed around a bit and settled on some semi-sensible priors assuming a Gaussian distribution for both the parameter and slope. I am assuming that our intercept is normally distributed with around a mean of 2 and a standard variation of 2, our beta coeff is centered around 0.01 with a sd of 0.1. This produces irrperc values within an acceptable range (roughly 0-15%). There are some negative values here. Perhaps a prior that is bounded by 0 would be a better fit for this, but experiments with log normal distributions have proved difficult. Also, none of the countries have negative trajectories of irrigation expansion, but some do have a decrease towards the end of the study period.

set.seed(17)
N <- 50
a <- rnorm(N , 2, 2)
b <- rnorm( N , 0.05, 0.05 )

plot( NULL , xlim=range(aei$yearcount) , ylim=c(-50,50) , xlab="year" , ylab="Irrigation Percentage" )
abline( h=0 , lty=2 )
for ( i in 1:N ) curve( a[i] + b[i]*x ,
from=min(aei$yearcount) , to=max(aei$yearcount) , add=TRUE , col=col.alpha("black",0.2) )

These don’t look too bad. There are some lines that predict negative values but in general they seem to be positive and have a very general upward slope.

First Model

Here were fitting the first model which is just dependent on the year count and the priors we specified above..

\[ \begin{aligned} irrperc_c &\sim N(\mu_c, \sigma_c) \\ \mu_c &= \alpha_c + \beta_c*yearcount \\ \alpha_c &\sim N(2,2) \\ \beta_c &\sim N(0.05, 0.05) \\ \sigma_c &\sim exp(1) \end{aligned} \]

I won’t use the first country, as we have 0 for the irrigation percent. AFG is the second country, and there is some evolution.

AFG_norm_yearcount <-
  brm(data = by_ISO$data[[2]], #AFG
      formula = irrperc ~ yearcount,
      control = list(adapt_delta = 0.99),
      prior = c(prior(normal(2,2), class = Intercept),
                prior(normal(0.01, 0.1), class = b),
                prior(exponential(1), class = sigma)),
      iter = 4000, chains = 4, cores = 4,
      seed = 17,
      file = "/Volumes/RachelExternal/Thesis/Thesis/fits/CL_Fits/AFG_norm_d_yearcount13")

Yep looks fine here! Check these posterior and the priors, just to see that brms is putting them in the right place…

print(AFG_norm_yearcount, digits = 4)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: irrperc ~ yearcount 
   Data: by_ISO$data[[2]] (Number of observations: 13) 
Samples: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup samples = 8000

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI   Rhat Bulk_ESS Tail_ESS
Intercept   0.7944    0.1916   0.4149   1.1780 1.0020     4598     3814
yearcount   0.0456    0.0031   0.0394   0.0515 1.0020     5025     3464

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI   Rhat Bulk_ESS Tail_ESS
sigma   0.3235    0.0774   0.2108   0.5108 1.0003     3621     3607

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
prior_summary(AFG_norm_yearcount)
             prior     class      coef group resp dpar nlpar bound       source
 normal(0.01, 0.1)         b                                               user
 normal(0.01, 0.1)         b yearcount                             (vectorized)
      normal(2, 2) Intercept                                               user
    exponential(1)     sigma                                               user

Yep, all good.

Now for the first part of the master plan, apply this to all countries, resulting in individual country trajectories, individual slopes and intercepts, but calculated only with the 8 data points for each country. Use map here to apply this to every ISO.

This runs, and for some of the models it yells that things didn’t converge.. and I’m just going to leave it, as this is not really the most important part.

Calculation of Mean Structure

Again, using the code suggested form S. Kurz, the intercept/intercept standard deviation and the rate of change/rate of change sd can be extracted from the estimates and coeffs.


mean_structure <-
  models %>% 
  mutate(coefs = map(model, ~ posterior_summary(.)[1:2, 1:2] %>% 
                       data.frame() %>% 
                       rownames_to_column("coefficients"))) %>% 
  unnest(coefs) %>% 
  select(-data, -model) %>% 
  unite(temp, Estimate, Est.Error) %>% 
  pivot_wider(names_from = coefficients,
              values_from = temp) %>% 
  separate(b_Intercept, into = c("init_stat_est", "init_stat_sd"), sep = "_") %>% 
  separate(b_yearcount, into = c("rate_change_est", "rate_change_sd"), sep = "_") %>% 
  mutate_if(is.character, ~ as.double(.) %>% round(digits = 2)) %>% 
  ungroup()

Calculation of Residual Variance

residual_variance <-
  models %>% 
  mutate(residual_variance = map_dbl(model, ~ posterior_summary(.)[3, 1])^2) %>% 
  mutate_if(is.double, round, digits = 2) %>% 
  select(ISO, residual_variance)

Calculation of Bayesian \(R^2\)

r2 <-
  models %>% 
  mutate(r2 = map_dbl(model, ~ bayes_R2(., robust = T)[1])) %>% 
  mutate_if(is.double, round, digits = 2) %>% 
  select(ISO, r2)

Fit

table <-
  models %>% 
  unnest(data) %>% 
  group_by(ISO) %>% 
  slice(1) %>% 
  select(ISO) %>% 
  left_join(mean_structure,    by = "ISO") %>% 
  left_join(residual_variance, by = "ISO") %>% 
  left_join(r2,                by = "ISO") %>% 
  rename(residual_var = residual_variance) %>% 
  select(ISO, init_stat_est:r2, everything()) %>% 
  ungroup()

table %>% 
  knitr::kable()
ISO init_stat_est init_stat_sd rate_change_est rate_change_sd residual_var r2
ABW 0.00 0.00 0.00 0.00 0.00 0.50
AFG 0.79 0.19 0.05 0.00 0.11 0.96
AGO 0.04 0.00 0.00 0.00 0.00 0.91
ALB -2.20 1.40 0.17 0.02 6.43 0.85
AND -0.09 0.07 0.00 0.00 0.02 0.53
ARE -0.37 0.45 0.02 0.01 0.60 0.57
ARG 0.25 0.04 0.00 0.00 0.00 0.86
ARM 2.86 0.68 0.09 0.01 1.38 0.87
ASM 0.00 0.00 0.00 0.00 0.00 0.50
ATG -0.09 0.13 0.01 0.00 0.05 0.47
AUS -0.06 0.03 0.01 0.00 0.00 0.94
AUT 0.30 0.14 0.01 0.00 0.06 0.61
AZE 5.17 0.75 0.13 0.01 1.64 0.94
BDI -0.15 0.08 0.01 0.00 0.02 0.88
BEL 0.29 0.12 0.00 0.00 0.04 0.39
BEN -0.03 0.01 0.00 0.00 0.00 0.84
BFA -0.02 0.01 0.00 0.00 0.00 0.77
BGD -7.27 3.51 0.26 0.07 59.18 0.58
BGR 0.18 1.91 0.07 0.03 13.29 0.28
BHR -0.02 0.69 0.04 0.01 1.51 0.60
BHS -0.03 0.02 0.00 0.00 0.00 0.73
BIH 0.03 0.01 0.00 0.00 0.00 0.09
BLR -0.09 0.10 0.01 0.00 0.03 0.79
BLZ -0.04 0.02 0.00 0.00 0.00 0.79
BMU 0.00 0.00 0.00 0.00 0.00 0.50
BOL -0.01 0.01 0.00 0.00 0.00 0.90
BRA -0.11 0.05 0.00 0.00 0.01 0.81
BRB -3.00 1.42 0.14 0.02 6.81 0.74
BRN -0.06 0.03 0.00 0.00 0.00 0.73
BTN -0.03 0.11 0.01 0.00 0.03 0.79
BWA 0.00 0.00 0.00 0.00 0.00 0.76
CAF 0.00 0.00 0.00 0.00 0.00 0.55
CAN -0.01 0.01 0.00 0.00 0.00 0.91
CHE 0.58 0.14 0.01 0.00 0.06 0.46
CHL 0.91 0.15 0.02 0.00 0.06 0.84
CHN 0.93 0.37 0.06 0.01 0.41 0.91
CIV -0.05 0.03 0.00 0.00 0.00 0.83
CMR -0.01 0.01 0.00 0.00 0.00 0.84
COD 0.00 0.00 0.00 0.00 0.00 0.76
COG 0.00 0.00 0.00 0.00 0.00 0.79
COL -0.18 0.09 0.01 0.00 0.02 0.85
COM -0.02 0.02 0.00 0.00 0.00 0.63
CPV 0.10 0.03 0.01 0.00 0.00 0.95
CRI -0.16 0.20 0.02 0.00 0.12 0.83
CUB -1.51 0.80 0.11 0.01 1.96 0.87
CYM 0.00 0.00 0.00 0.00 0.00 0.50
CYP 0.67 0.30 0.04 0.00 0.28 0.89
CZE -0.01 0.31 0.01 0.01 0.29 0.48
DEU 0.87 0.68 0.02 0.01 1.38 0.23
DJI -0.01 0.01 0.00 0.00 0.00 0.79
DMA 0.00 0.00 0.00 0.00 0.00 0.50
DNK -2.65 1.29 0.14 0.02 5.35 0.80
DOM -0.78 0.27 0.07 0.00 0.22 0.97
DZA 0.04 0.02 0.00 0.00 0.00 0.74
ECU -0.50 0.16 0.04 0.00 0.07 0.97
EGY 2.13 0.08 0.01 0.00 0.02 0.93
ERI -0.04 0.03 0.00 0.00 0.00 0.78
ESP 1.91 0.32 0.05 0.01 0.31 0.92
EST 0.00 0.06 0.00 0.00 0.01 0.26
ETH -0.05 0.04 0.00 0.00 0.00 0.77
FIN -0.05 0.04 0.00 0.00 0.00 0.75
FJI -0.04 0.03 0.00 0.00 0.00 0.70
FRA -0.44 0.49 0.05 0.01 0.70 0.82
FRO 0.00 0.00 0.00 0.00 0.00 0.50
FSM 0.00 0.00 0.00 0.00 0.00 0.50
GAB 0.00 0.00 0.00 0.00 0.00 0.78
GBR -0.09 0.10 0.01 0.00 0.03 0.84
GEO 0.76 0.30 0.06 0.01 0.27 0.95
GHA -0.03 0.02 0.00 0.00 0.00 0.77
GIB 0.00 0.00 0.00 0.00 0.00 0.50
GIN -0.04 0.04 0.00 0.00 0.00 0.87
GLP -0.70 0.51 0.03 0.01 0.76 0.65
GMB 0.07 0.02 0.00 0.00 0.00 0.53
GNB 0.19 0.04 0.01 0.00 0.01 0.92
GNQ 0.00 0.00 0.00 0.00 0.00 0.50
GRC -0.60 0.64 0.12 0.01 1.24 0.93
GRD -0.17 0.14 0.01 0.00 0.05 0.53
GRL 0.00 0.00 0.00 0.00 0.00 0.50
GTM -0.06 0.11 0.01 0.00 0.04 0.87
GUF -0.01 0.01 0.00 0.00 0.00 0.71
GUM -0.12 0.12 0.00 0.00 0.04 0.36
GUY 0.12 0.03 0.01 0.00 0.00 0.97
HND 0.01 0.04 0.01 0.00 0.01 0.93
HRV 0.01 0.04 0.00 0.00 0.01 0.18
HTI 0.01 0.15 0.04 0.00 0.06 0.97
HUN -0.16 0.72 0.04 0.01 1.58 0.55
IDN 0.84 0.21 0.02 0.00 0.14 0.83
IND 4.67 1.86 0.12 0.03 10.43 0.71
IRL 0.03 0.01 0.00 0.00 0.00 0.35
IRN 1.00 0.07 0.04 0.00 0.02 0.99
IRQ -0.46 0.76 0.08 0.01 1.84 0.82
ISL 0.00 0.00 0.00 0.00 0.00 0.50
ISR -0.06 0.83 0.11 0.01 2.15 0.87
ITA 4.29 0.36 0.10 0.01 0.39 0.97
JAM 0.94 0.10 0.02 0.00 0.03 0.93
JOR -0.03 0.08 0.01 0.00 0.02 0.85
JPN 8.12 0.21 0.00 0.00 0.13 0.06
KAZ 0.47 0.09 0.01 0.00 0.02 0.82
KEN -0.02 0.02 0.00 0.00 0.00 0.79
KGZ 1.88 0.30 0.04 0.00 0.26 0.91
KHM -0.29 0.22 0.02 0.00 0.14 0.79
KIR 0.00 0.00 0.00 0.00 0.00 0.50
KNA 0.05 0.00 0.00 0.00 0.00 0.73
KOR 3.02 0.56 0.08 0.01 0.94 0.88
KWT -0.11 0.08 0.00 0.00 0.02 0.61
LAO -0.27 0.18 0.01 0.00 0.10 0.69
LBN -0.15 0.59 0.11 0.01 1.05 0.93
LBR -0.01 0.00 0.00 0.00 0.00 0.84
LBY -0.02 0.03 0.00 0.00 0.00 0.83
LCA -1.23 0.68 0.06 0.01 1.41 0.76
LIE 0.00 0.00 0.00 0.00 0.00 0.50
LKA 1.87 0.24 0.08 0.00 0.17 0.98
LSO -0.02 0.01 0.00 0.00 0.00 0.81
LTU -0.01 0.13 0.00 0.00 0.05 0.28
LUX 1.28 0.25 -0.02 0.00 0.19 0.66
LVA 0.01 0.07 0.00 0.00 0.01 0.24
MAR 0.96 0.10 0.02 0.00 0.03 0.96
MCO 0.00 0.00 0.00 0.00 0.00 0.50
MDA -1.99 0.81 0.12 0.01 2.02 0.90
MDG -0.20 0.17 0.02 0.00 0.08 0.88
MDV 0.00 0.00 0.00 0.00 0.00 0.50
MEX 0.25 0.16 0.03 0.00 0.07 0.95
MHL 0.00 0.00 0.00 0.00 0.00 0.50
MKD -1.02 0.56 0.07 0.01 0.93 0.87
MLI -0.02 0.03 0.00 0.00 0.00 0.58
MLT -0.67 0.92 0.06 0.01 2.64 0.65
MMR -0.52 0.25 0.03 0.00 0.18 0.89
MNG -0.01 0.01 0.00 0.00 0.00 0.76
MOZ 0.00 0.01 0.00 0.00 0.00 0.91
MRT 0.01 0.00 0.00 0.00 0.00 0.85
MTQ -0.38 0.69 0.06 0.01 1.43 0.72
MUS -1.05 0.49 0.12 0.01 0.69 0.96
MWI -0.14 0.09 0.01 0.00 0.02 0.67
MYS 0.32 0.03 0.01 0.00 0.00 0.98
NAM 0.00 0.00 0.00 0.00 0.00 0.85
NCL -0.10 0.07 0.00 0.00 0.01 0.70
NER -0.01 0.01 0.00 0.00 0.00 0.75
NGA 0.15 0.02 0.00 0.00 0.00 0.79
NIC -0.06 0.06 0.01 0.00 0.01 0.88
NLD 2.14 2.07 0.12 0.03 14.82 0.54
NOR -0.04 0.04 0.00 0.00 0.01 0.77
NPL -2.06 0.99 0.10 0.02 3.02 0.78
NRU 0.00 0.00 0.00 0.00 0.00 0.50
NZL 0.00 0.18 0.02 0.00 0.10 0.89
OMN 0.03 0.02 0.00 0.00 0.00 0.78
PAK -2.97 1.95 0.25 0.03 12.77 0.89
PAN 0.03 0.04 0.01 0.00 0.00 0.89
PER 0.21 0.09 0.01 0.00 0.02 0.91
PHL -0.41 0.33 0.07 0.01 0.33 0.94
PLW 0.00 0.00 0.00 0.00 0.00 0.50
PNG 0.00 0.00 0.00 0.00 0.00 0.50
POL 0.03 0.19 0.01 0.00 0.11 0.46
PRI 1.63 0.49 0.02 0.01 0.71 0.52
PRK -0.18 0.98 0.12 0.02 3.24 0.85
PRT 2.54 0.83 0.07 0.01 2.09 0.76
PRY 0.00 0.01 0.00 0.00 0.00 0.95
PSE 0.20 0.28 0.04 0.00 0.23 0.90
PYF -0.08 0.06 0.00 0.00 0.01 0.53
QAT -0.29 0.18 0.01 0.00 0.09 0.68
REU -0.73 0.43 0.05 0.01 0.56 0.87
ROU -2.46 2.03 0.13 0.03 14.31 0.54
RUS -0.03 0.05 0.00 0.00 0.01 0.64
RWA -0.06 0.04 0.00 0.00 0.00 0.82
SAU -0.11 0.10 0.01 0.00 0.03 0.79
SDN 0.03 0.07 0.01 0.00 0.01 0.93
SEN 0.19 0.04 0.00 0.00 0.00 0.81
SGP 0.00 0.00 0.00 0.00 0.00 0.50
SLB 0.00 0.00 0.00 0.00 0.00 0.50
SLE -0.10 0.06 0.01 0.00 0.01 0.80
SLV -0.46 0.20 0.03 0.00 0.12 0.91
SMK -0.05 0.29 0.02 0.00 0.27 0.63
SMR 0.00 0.00 0.00 0.00 0.00 0.50
SOM 0.04 0.03 0.00 0.00 0.00 0.85
STP 4.72 0.62 0.07 0.01 1.20 0.83
SUR -0.03 0.02 0.00 0.00 0.00 0.92
SVK -0.99 0.56 0.05 0.01 1.00 0.75
SVN -0.09 0.13 0.01 0.00 0.05 0.43
SWE 0.03 0.06 0.00 0.00 0.01 0.51
SWZ -0.55 0.30 0.04 0.00 0.27 0.88
SYC -0.11 0.12 0.00 0.00 0.04 0.29
SYR -0.73 0.61 0.07 0.01 1.14 0.85
TCA 0.00 0.00 0.00 0.00 0.00 0.50
TCD 0.00 0.00 0.00 0.00 0.00 0.80
TGO -0.02 0.02 0.00 0.00 0.00 0.64
THA -1.44 0.77 0.12 0.01 1.90 0.90
TJK 0.71 0.19 0.05 0.00 0.11 0.97
TKM -0.36 0.35 0.04 0.01 0.35 0.86
TLS 0.12 0.24 0.02 0.00 0.16 0.69
TON 0.00 0.00 0.00 0.00 0.00 0.50
TTO 0.00 0.06 0.01 0.00 0.01 0.91
TUN 0.13 0.25 0.02 0.00 0.18 0.81
TUR -0.46 0.56 0.07 0.01 1.00 0.84
TUV 0.00 0.00 0.00 0.00 0.00 0.50
TWN 9.67 1.33 0.05 0.02 4.65 0.46
TZA -0.03 0.03 0.00 0.00 0.00 0.80
UGA -0.01 0.01 0.00 0.00 0.00 0.86
UKR -0.95 0.46 0.06 0.01 0.64 0.87
URY -0.30 0.16 0.01 0.00 0.08 0.78
USA 0.72 0.10 0.03 0.00 0.03 0.97
UZB 2.21 0.39 0.09 0.01 0.46 0.95
VCT -0.62 0.41 0.04 0.01 0.50 0.78
VEN -0.06 0.05 0.01 0.00 0.01 0.90
VGB 0.00 0.00 0.00 0.00 0.00 0.50
VIR -0.10 0.07 0.00 0.00 0.02 0.67
VNM -0.99 1.16 0.12 0.02 4.39 0.78
VUT 0.00 0.00 0.00 0.00 0.00 0.50
WSM 0.00 0.00 0.00 0.00 0.00 0.50
YEM 0.08 0.10 0.01 0.00 0.03 0.77
ZAF 0.08 0.03 0.01 0.00 0.00 0.99
ZMB -0.05 0.04 0.00 0.00 0.00 0.58
ZWE -0.10 0.05 0.00 0.00 0.01 0.84

Plotting

ggsave("/Volumes/RachelExternal/Thesis/Thesis/abline_95.png", abline, height = 200,limitsize = FALSE, dpi = 300 )
Saving 7 x 200 in image

Individual Country Fits

Some of these countries don’t have fantastic fits. ALB, BGD, BGR, BRB, CUB, DNK, IND, NLD, PAK, ROU, have some issues.


Using No Priors

What if I do this again with no priors… and see if it fits better for the countries with more extreme increases in irrigated area over time. I’ve run the same setup as above with the calculation of the mean, variance and bayesian \(R^2\). Below I’ve graphed the fits for the problem countries (ALB, BGD, BGR, BRB, CUB, DNK, IND, NLD, PAK, ROU).

The No Prior Model

Plotting

Saving 7 x 7 in image

Individual Country Fits with no priors specified

Whoa, ok so these countries are fit way way better. This makes sense though, because if the model is being refit for each country then the priors can fluctuate much more. in the first run, I limited the priors to a pretty narrow slope that wasn’t helpful for countries that have expansion trajectories that increase quicker than the range I specified in the prior. If brms is behaving like I expect it to behave, its fitting a uniform prior for the slope and a student t distribution for the intercept, FOR EACH COUNTRY. I could go back and weaken the priors chosen for fit1 but what I’m doing here is not that important.

Prior vs. No Prior Fits

Comparison of Tables

Lets check the table, first the problem countries fit without a prior.

And now the original fit, with the priors.

Yeah for all of these countries the fit has been improved, by visual inspection and the bayes \(R^2\) by disregarding the priors and allowing the model to fit with default priors for each country.


Comparison Visually

We can use these individual fits to look at variances between regions.

irrperc_fitted <-
  mean_structure %>% 
  mutate(`1910` = init_stat_est + rate_change_est * 0,
         `2005` = init_stat_est + rate_change_est * 95) %>% 
  select(ISO, `1910`, `2005`) %>% 
  pivot_longer(-ISO, 
               names_to = "year", 
               values_to = "irrperc") %>% 
  mutate(year = as.integer(year))



irrperc_fitted_no_prior <-
  mean_structure_noprior %>% 
  mutate(`1910` = init_stat_est + rate_change_est * 0,
         `2005` = init_stat_est + rate_change_est * 95) %>% 
  select(ISO, `1910`, `2005`) %>% 
  pivot_longer(-ISO, 
               names_to = "year", 
               values_to = "irrperc") %>% 
  mutate(year = as.integer(year))

we can quickly put some regions here and plot things by regions.

Not too much changes here between these two graphs. You can see that some of these slopes are terribly wrong, as their intercepts are negative which is nonsensical. In addition the regularization of the fits can be seem from chart to chart, particularly with those countries that had a bigger rate of change (slope) value.